fix(wasm): align TypeScript CHA dispatch confidence (0.6 → 0.8)#1505
Conversation
… diff Adds snapshot-pre-bash.sh (PreToolUse Bash) + track-bash-writes.sh (PostToolUse Bash): the pre-hook captures git status --porcelain to a per-worktree temp file before each Bash call; the post-hook diffs the before/after state and appends newly modified or created files to .claude/session-edits.log. This closes the gap where files written by sed -i, printf redirects, tee, heredocs, or build tools (Cargo.lock, lockfiles) were never recorded, causing guard-git.sh to emit false-positive BLOCKED errors. Closes #1457
- clojure.rs: annotate lifetime-anchor assignment to silence false-positive - cfg.rs: remove never-called start_line_of method - complexity.rs: remove never-constructed NotHandled variant; convert irrefutable if-let patterns to plain let destructures - dataflow.rs: remove never-read callee fields from CallReturn/Destructured - incremental.rs: remove never-read lang field from CacheEntry cargo check and cargo clippy both clean after these changes.
Adds .github/workflows/perf-canary.yml — a path-filtered workflow that fires on PRs touching src/extractors/, src/domain/graph/, or crates/** and runs only the incremental-benchmark suite (full build + no-op + 1-file rebuild, both engines). Catches the class of regressions that accumulated invisibly across the Phase 8.x PRs and were only detected at v3.12.0 publish time. The regression guard gains BENCH_CANARY=1 mode: raises thresholds to 50%/100%/150% (standard/noisy/WASM) and skips the build, query, and resolution suites — only incremental checks run. This absorbs shared- runner timing variance while still blocking catastrophic regressions (+98% full build, +1827% 1-file rebuild from v3.12.0). Closes #1433
On incremental builds, runPostNativeCha previously scanned all call→qualified-method edges in the DB (~12ms flat, O(graph size)), even for 1-file changes where no hierarchy or RTA evidence changed. Add two cheap indexed gate queries. Gate A checks whether any changed file introduced a class/interface/trait/struct/record node (hierarchy may have new implementors reachable from unchanged call sites). Gate B checks whether any changed file added a call edge to a class-kind target (RTA set may have grown, enabling previously filtered expansions in unchanged callers). If neither gate fires, restrict the candidate query to src.file IN changedFiles — safe because the hierarchy and instantiated set are unchanged for all other files. Full builds (isFullBuild=true) and cases where either gate fires retain the existing full-scan behaviour. Mirrors the changed-files scoping pattern of runPostNativeThisDispatch. Closes #1441
Times each JS post-pass in tryNativeOrchestrator and exposes the
measurements in BuildResult.phases:
- gapDetectMs — dropped-language gap detection + backfill
- chaMs — CHA expansion (interface dispatch)
- thisDispatchMs — this/super dispatch WASM re-parse (was already
tracked but now properly named alongside the rest)
- reclassifyMs — scoped role re-classification after edge insertion
- techniqueBackfillMs — technique-column UPDATE on native-written edges
Previously only thisDispatchMs was reported, causing wall-clock vs
phaseSum to diverge by 1.1s+ on 1-file rebuilds and making benchmark
regressions undiagnosable from committed history.
Updates update-incremental-report.ts to render the new phases in a
collapsible details block under each engine's 1-file rebuild section.
Closes #1434
…ld for required-tier grammars The docstring claimed pool cost was "amortised over enough parse work" — measurements show IPC overhead scales linearly (~55–64ms/file pool vs ~8–10ms/file inline). The real motivation is crash safety for exotic WASM grammars (#965); JS/TS/TSX (required-tier, used in all this-dispatch backfill calls) have never triggered the V8 fatal crash class and are safe to run inline. Raise threshold 16 → 32 to keep typical this-dispatch batches (≤ 18 files on the codegraph corpus) on the inline fast path. Exotic-language drops are almost always well under 32 files and also benefit from the inline path without meaningful crash risk increase. Closes #1435
…e incremental rebuilds On 1-file native incremental builds, two JS post-passes ran unconditionally even when they had no work to do: - `backfillNativeDroppedFiles`: called whenever changedCount > 0, even when detectDroppedLanguageGap returned an empty gap. Gate now checks gap.missingAbs.length > 0 || gap.staleRel.length > 0 directly, matching backfillNativeDroppedFiles's own internal early-exit guard. - Node/edge COUNT(*) re-count: ran unconditionally after all post-passes even when none of them wrote any edges. COUNT(*) over 50K+ edge tables is non-trivial, especially via the NativeDbProxy napi-rs round-trip. Now gated on postPassWroteData (backfill | CHA edges | this-dispatch edges). Closes #1454
The post-pass it timed (runPostNativePrototypeMethods) was deleted in b5c03a2 when func-prop extraction moved to Rust (#1432). The optional field was never set by any code path that survived the deletion. Also remove the stale reference to "prototype-methods post-pass" from the parseFilesWasmForBackfill docstring — only the this-dispatch post-pass uses symbolsOnly now. Closes #1432
… collision Field type annotations (`private repo: OrderRepository`) were seeded as bare file-wide typeMap keys, causing `this.repo` inside `UserService` to resolve to `OrderRepository` when both classes had a `repo` field (issue #1458). Both extractors (TS `handleFieldDefTypeMap` and Rust `field_definition` branch) now seed `ClassName.field` keys at confidence 0.9, matching the `CallerClass.X` resolver fallback added in PR #1382. Bare keys are kept at confidence 0.6 as fallbacks for single-class files or class expressions where no enclosing class name is available. Both engines change identically — parity preserved.
…ed names
The resolution benchmark uses WASM-built graphs where the Elixir, Julia,
and Objective-C extractors emit module-qualified symbol names (Main.run,
App.main, UserService.create_user, etc.). The expected-edges manifests
were written with bare unqualified names (run, main, create_user), so
every correctly-resolved edge appeared as a false positive and every
expected edge appeared as a false negative — causing all three languages
to show 0% precision even though resolution was working correctly.
Root cause: starting in v3.12.0, cross-module call resolution began working
for these languages (via the improved receiver-dispatch and same-class
fallback in resolveByMethodOrGlobal / build-edges.ts). With 0 edges
previously resolved, the name mismatch was invisible; once edges started
resolving, the manifests showed 17 FP (elixir), 11 FP (julia), 6 FP
(objc) — all correctly resolved edges misidentified as false positives.
Fix:
- Update all three expected-edges.json manifests to use the
module-qualified names matching actual extractor output:
elixir: Main.run, UserService.create_user, Validators.validate_user, etc.
julia: App.main, Service.create_user, Repository.new_repo, etc.
objc: full ObjC selectors (createUserWithId:name:email:, isValidEmail:, etc.)
plus add main -> run (plain C call correctly resolved)
- Ratchet THRESHOLDS for all three:
elixir: precision 0.0 -> 1.0, recall 0.0 -> 0.8 (17/21 resolved)
julia: precision 0.0 -> 1.0, recall 0.0 -> 0.7 (11/15 resolved)
objc: precision 0.0 -> 1.0, recall 0.0 -> 0.4 (6/13 resolved)
Remaining FNs are genuine unresolved edges (same-file bare calls in
elixir/julia, receiver-typed message sends in objc) — not regressions.
Closes #1447
The JS C++ and CUDA extractors had no handler for 'declaration' AST nodes,
so typeMap was never seeded for statically-typed locals (e.g. 'UserService svc;').
Without a typeMap entry for 'svc', resolveReceiverEdge had nothing to look up and
silently skipped the receiver edge.
Add handleCppDeclaration / handleCudaDeclaration to both extractors. They mirror
match_c_family_type_map ('declaration' branch) from the native Rust path: extract
the type node text and seed typeMap[varName] = { type, confidence: 0.9 } for each
identifier or init_declarator child. Primitive types (int, char, bool, …) are
skipped to avoid spurious edges.
parity-compare.mjs --langs cpp,cuda --hybrid: PARITY OK (wasm = native = hybrid)
All 3044 tests pass.
… in Rust solver
Go extractor was only seeding typeMap for var_spec and parameter_declaration,
missing short_var_declaration. Added infer_short_var_types to handle:
- x := Struct{} → conf 1.0 (composite literal)
- x := &Struct{} → conf 1.0 (address-of composite)
- x := NewFoo() / x := pkg.NewFoo() → conf 0.7 (New* factory prefix)
Python extractor was only seeding typeMap for typed_parameter and
typed_default_parameter, missing plain assignment. Added
infer_py_assignment_type to handle:
- order = Order(...) → conf 1.0 (uppercase constructor)
- obj = Module.Class(...) → conf 0.7 (uppercase module prefix, non-builtin)
Both mirror the existing JS extractors exactly. Parity check for
go and python: wasm vs native/hybrid OK.
…l, zig) Both engines used different rules for attributing calls inside variable bindings: WASM: attributed to the narrowest enclosing span regardless of kind, so local variable declarations inside fn main() shadowed the enclosing function (Zig: calls attributed to repo/svc variables instead of main), and nested let-bindings inside a Haskell do-block shadowed the top-level main binding. Native: loaded allNodes from a query that excluded 'variable' kind, so top-level Haskell bind nodes (main = do …, kind='variable') never matched in defs_with_ids, causing all calls to fall back to the file node. Unified rule implemented in findCaller (TS) and find_enclosing_caller (Rust): - Function/method definitions are preferred over any variable/constant binding as the enclosing caller scope — local var declarations inside a function body never shadow the enclosing function (fixes Zig repo/svc attribution). - When no function/method encloses the call, fall back to the WIDEST (outermost) variable/constant binding — this handles Haskell where main is a top-level bind node with kind 'variable'. Widest span is used so that nested let-bindings do not shadow the outer main binding. - File node remains the absolute last resort. Also adds 'variable' to NODE_KIND_FILTER_SQL (JS) and EDGE_NODE_KIND_FILTER (Rust pipeline.rs) so top-level variable bindings are included in the allNodes set available for caller matching. parity-compare.mjs --langs haskell,zig --hybrid: PARITY OK — 2/2 fixtures.
…and test Remove unused TypeMapEntry import from cpp.ts and cuda.ts, reformat primitive-type Set literals and test expect() calls to satisfy biome line-length rules.
Java was the only fixture where all three build paths (wasm, native, hybrid) disagreed pairwise. Bug 1 — WASM typeMap pollution: `handleJavaLocalVarDecl` used last-wins Map.set(), so the local `InMemoryUserRepository repo` in the static `createDefault()` method silently overrode the constructor parameter `UserRepository repo`. This caused WASM to bypass the interface and resolve directly to the concrete class, producing no interface edge and the wrong receiver. Fix: switch to first-wins `setTypeMapEntry` to match Rust extractor semantics. First-wins preserves the interface annotation that drives correct CHA dispatch. Bug 2 — native vs wasm/hybrid confidence mismatch: `runPostNativeCha` (native orchestrator path) used `computeConfidence − CHA_DISPATCH_PENALTY = 0.7 − 0.1 = 0.6`, while `runChaPostPass` (DB post-pass used by wasm and hybrid) hardcodes 0.8. Fix: align `runPostNativeCha` to also use 0.8. Result: all three build paths now emit identical edges and confidences. `parity-compare.mjs --langs java --hybrid` passes. Updated expected-edges.json to include both the interface declaration edge (TypeRepository.X at 0.7) and the CHA-expanded impl edge (InMemoryUserRepository.X at 0.8), which are the correct semantics for an interface-typed receiver. Closes #1469
The inline CHA expansion in buildCallEdges and buildChaPostPass used computeConfidence(relPath, t.file) - CHA_DISPATCH_PENALTY for all CHA targets, producing 0.6 for cross-directory interface dispatch (same-dir = 0.7, minus 0.1 penalty). runChaPostPass (helpers.ts) and runPostNativeCha (native-orchestrator.ts) both hardcode 0.8 for interface/CHA-dispatch edges. The deduplication in runChaPostPass uses the existing DB edge as-is and skips reinsertion, so the 0.6 edges from the inline pass were never upgraded to 0.8. Fix: typed-receiver (interface) dispatch branches now use hardcoded 0.8 matching the post-pass constants. The this/super branch keeps computeConfidence-based proximity scoring to remain aligned with runPostNativeThisDispatch. parity-compare.mjs --langs typescript --hybrid goes green (was 12 edge diffs). Closes #1470 docs check acknowledged
|
Claude finished @carlos-alm's task —— View job PR Review: fix(wasm): align TypeScript CHA dispatch confidence (0.6 → 0.8)
Root Cause AnalysisProblem Being Solved: The WASM inline CHA dispatch path ( Why This Approach: Hardcoding Risk Assessment: Low. The change is surgical (two lines), the logic is narrowly scoped to Phase 2: Configuration Change DetectionNo configuration changes. No linting rules relaxed, no TypeScript flags disabled, no test timeouts altered. ✓ Phase 3: Test IntegrityNo test files were modified. The PR description reports 100% precision/recall on the TypeScript benchmark and 178/179 passing (pre-existing Erlang failure filed separately as #1504). ✓ Phase 4: Code Quality
const conf = isTypedReceiverDispatch
? 0.8
: computeConfidence(relPath, t.file, null) - CHA_DISPATCH_PENALTY;Two minor observations (neither blocking):
No bugs introduced. No security concerns. No SQL changes. No new dependencies. Backlog Compliance
Critical ConcernsNone. Final Recommendation
|
Greptile SummaryThis PR fixes a confidence-value divergence between the three CHA dispatch engine paths (WASM inline, WASM post-pass, native post-pass) for typed-receiver (interface) dispatch edges. It consolidates Confidence Score: 5/5Safe to merge — the effective confidence values written to the DB are unchanged for the native and hybrid paths, and the WASM inline path is corrected from 0.6 to 0.8, aligning all three engines. The change touches only confidence constant plumbing with no control-flow or schema changes. The this/super dispatch branch is intentionally kept on computeConfidence − penalty to align with runPostNativeThisDispatch, and the typed-receiver branch is uniformly updated across all three sites. The only leftover is an orphaned export that carries no runtime risk. helpers.ts still exports the now-unused CHA_DISPATCH_CONFIDENCE at line 372; worth removing to avoid future confusion. Important Files Changed
Sequence DiagramsequenceDiagram
participant W as WASM inline (build-edges.ts)
participant WP as WASM post-pass (helpers.ts)
participant NP as Native post-pass (native-orchestrator.ts)
participant DB as SQLite DB
note over W,NP: Before this PR — typed-receiver CHA edges
W->>DB: "insert edge, conf = computeConfidence − 0.1 → 0.6 (cross-dir)"
WP->>DB: skip (edge already exists), 0.8 never applied
NP->>DB: skip (edge already exists), 0.8 never applied
note over W,NP: After this PR — all three engines agree
W->>DB: "insert edge, conf = CHA_TYPED_DISPATCH_CONFIDENCE = 0.8"
WP->>DB: "skip or insert, conf = CHA_TYPED_DISPATCH_CONFIDENCE = 0.8"
NP->>DB: "skip or insert, conf = CHA_TYPED_DISPATCH_CONFIDENCE = 0.8"
Reviews (6): Last reviewed commit: "fix: resolve merge conflicts with main" | Re-trigger Greptile |
Codegraph Impact Analysis4 functions changed → 7 callers affected across 5 files
|
…TCH_CONFIDENCE constant Switch handleCppDeclaration and handleCudaDeclaration from last-wins ctx.typeMap.set() to first-wins setTypeMapEntry(), fixing the same flat typeMap pollution bug this PR corrects in java.ts. Extract the CHA dispatch confidence value to a named CHA_DISPATCH_CONFIDENCE constant in helpers.ts so runChaPostPass and the native orchestrator share a single source of truth instead of two synchronized magic numbers.
Move CHA_DISPATCH_PENALTY to helpers.ts alongside the new CHA_TYPED_DISPATCH_CONFIDENCE = 0.8 constant so all four CHA dispatch sites (helpers.ts runChaPostPass, build-edges.ts buildChaPostPass, build-edges.ts buildFileCallEdges, native-orchestrator.ts runPostNativeCha) reference a single named export instead of repeating the literal. native-orchestrator.ts now imports both constants from helpers.ts, removing the build-edges.ts import.
|
Addressed the magic-number concern raised by both reviewers: extracted |
|
Merge commit 2a955b5 synced the branch with the upstream base (fix/java-interface-dispatch-1469). No logic change. |
Summary
buildCallEdgesandbuildChaPostPassusedcomputeConfidence - CHA_DISPATCH_PENALTYfor typed-receiver (interface) dispatch, producing0.6for cross-directory calls (same-dir confidence 0.7, minus 0.1 penalty)runChaPostPass(helpers.ts) andrunPostNativeCha(native-orchestrator.ts) both hardcode0.8for interface/CHA-dispatch edges — but their deduplication skips reinsertion of already-existing edges, so the0.6edges from the inline pass were never upgradedresolveChaTargets) branch now uses hardcoded0.8; thethis/super(resolveThisDispatch) branch retainscomputeConfidence - penaltyto stay aligned withrunPostNativeThisDispatchWhat was already fixed by #1503
Sub-divergence 2 as originally described (hybrid=0.8, wasm+native=0.6) was actually inverted by the time this branch was created: native and hybrid were already at 0.8 (from
runPostNativeCha's hardcoded 0.8), and WASM was the outlier at 0.6. No changes to the native Rust engine were needed.Sub-divergence 1 (abstract-method CHA 0.9 vs 0.8) does not appear in the current parity output — hierarchy.ts same-file dispatch is consistent across all engines at 0.9, which is correct (same-file
computeConfidence= 1.0 minus penalty = 0.9, andrunPostNativeThisDispatchalso produces 0.9 for same-file targets).Test plan
parity-compare.mjs --langs typescript --hybridgoes green (was 12 edge diffs, now 0)Closes #1470